Train Feedforward Neural Network with Layer-wise Adaptive Rate via Approximating Back-matching Propagation
Authors
Abstract
Stochastic gradient descent (SGD) has achieved great success in training deep neural networks, where the gradient is computed through backpropagation. However, the back-propagated values of different layers vary dramatically. This inconsistency in gradient magnitude across layers makes it problematic to optimize a deep neural network with a single learning rate. We introduce back-matching propagation, which computes the backward values on a layer's parameters and input by matching the backward values on the layer's output. This leads to solving a set of least-squares problems, which is computationally expensive. We then simplify back-matching propagation with approximations and propose an algorithm that turns out to be regular SGD with a layer-wise adaptive learning rate strategy. This allows an easy implementation of our algorithm in current machine learning frameworks equipped with auto-differentiation. We apply our algorithm to training modern deep neural networks and achieve favorable results over SGD.
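The idea of running regular SGD with a separate learning rate per layer can be illustrated with a minimal NumPy sketch. The abstract does not state the rate that the authors derive, so the scaling rule below (base rate divided by the root-mean-square of each layer's gradient) is only a placeholder assumption, and the function name `layerwise_sgd_step` and the two toy layers are hypothetical.

```python
import numpy as np

def layerwise_sgd_step(params, grads, base_lr=0.1, eps=1e-8):
    """One SGD step in which every layer gets its own learning rate.

    The per-layer rate used here (base_lr / RMS of the layer's gradient)
    is a stand-in assumption, not the rate derived in the paper.
    """
    new_params, rates = [], []
    for w, g in zip(params, grads):
        rms = np.sqrt(np.mean(g ** 2))   # gradient magnitude of this layer
        lr = base_lr / (rms + eps)       # small gradients get a larger rate
        new_params.append(w - lr * g)    # ordinary SGD update with that rate
        rates.append(lr)
    return new_params, rates

# Two hypothetical layers whose gradients differ by about three orders of magnitude.
rng = np.random.default_rng(0)
params = [rng.normal(size=(4, 3)), rng.normal(size=(3, 2))]
grads = [1e-3 * rng.normal(size=(4, 3)), rng.normal(size=(3, 2))]

new_params, rates = layerwise_sgd_step(params, grads)

# RMS size of the update actually applied to each layer.
steps = [np.sqrt(np.mean((n - p) ** 2)) for n, p in zip(new_params, params)]
```

With a single global rate, the first layer's update would be roughly a thousand times smaller than the second's. Under this placeholder rule both layers move by about `base_lr` in RMS terms, which is the kind of imbalance a layer-wise rate is meant to correct.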
Similar resources
Prediction of the Liquid Vapor Pressure Using the Artificial Neural Network-Group Contribution Method
In this paper, vapor pressure for pure compounds is estimated using the Artificial Neural Networks and a simple Group Contribution Method (ANN–GCM). For model comprehensiveness, materials were chosen from various families. Most of materials are from 12 families. Vapor pressure data of 100 compounds is used to train, validate and test the ANN-GCM model. Va...
Optimizing of Iron Bioleaching from a Contaminated Kaolin Clay by the Use of Artificial Neural Network
In this research, the amount of Iron removal by bioleaching of a kaolin sample with high iron impurity with Aspergillus niger was optimized. In order to study the effect of initial pH, sucrose and spore concentration on iron, oxalic acid and citric acid concentration, more than twenty experiments were performed. The resulted data were utilized to train, validate and test the two layer artificia...
An Optimal Utilization of Cloud Resources using Adaptive Back Propagation Neural Network and Multi-Level Priority Queue Scheduling
With the innovation of cloud computing industry lots of services were provided based on different deployment criteria. Nowadays everyone tries to remain connected and demand maximum utilization of resources with minimum time and effort. Thus, making it an important challenge in cloud computing for optimum utilization of resources. To overcome this issue, many techniques have been proposed ...
SEISMIC DESIGN OF DOUBLE LAYER GRIDS BY NEURAL NETWORKS
The main contribution of the present paper is to train efficient neural networks for seismic design of double layer grids subject to multiple-earthquake loading. As the seismic analysis and design of such large scale structures require high computational efforts, employing neural network techniques substantially decreases the computational burden. Square-on-square double layer grids with the va...
Globally Tuned Cascade Pose Regression via Back Propagation with Application in 2D Face Pose Estimation and Heart Segmentation in 3D CT Images
Recently, a successful pose estimation algorithm, called Cascade Pose Regression (CPR), was proposed in the literature. Trained over Pose Index Feature, CPR is a regressor ensemble that is similar to Boosting. In this paper we show how CPR can be represented as a Neural Network. Specifically, we adopt a Graph Transformer Network (GTN) representation and accordingly train CPR with Back Propagati...
Journal title:
- CoRR
Volume abs/1802.09750, Issue -
Pages -
Publication date 2018